A hitchhiker's guide to CUDA programming
GPU Kernels
Show HN: GPU-accelerated sandboxes for running AI coding agents in parallel [video]
NCCL
TIL: For long-lived LLM sessions, swapping KV Cache to RAM is ~10x faster than recalculating it. Why isn't this a standard feature?
Loop Tiling
Structurally Valid Log Generation using FSM-GFlowNets
arxiv.org·1d
ONNX
Utilizing Chiplet-Locality For Efficient Memory Mapping In MCM GPUs (ETRI, Sungkyunkwan Univ.)
semiengineering.com·2d
Occupancy Optimization
Challenging the Fastest OSS Workflow Engine
PTX
The Hidden Ledger of Code: Tracking the Carbon Debt Inside Our Software
hackernoon.com·5h
Build Optimization
The next RISC-V processor frontier: AI
edn.com·1d
CPU Architecture
I tested Arc Raiders across four GPUs of different ages - optimization still exists
xda-developers.com·2h
PTX
Where to Buy or Rent GPUs for LLM Inference: The 2026 GPU Procurement Guide
GPU Kernels